AITopics | style dimension

style dimension

Neural Information Processing Systems | Apr-25-2026, 04:58:01 GMT

A.1 Face experiments. For the encoder, we use a ResNet-50 backbone followed by projection heads that output pointwise, lower-quantile, and upper-quantile predictions. Each projection head consists of a convolution layer followed by a Leaky ReLU activation and a global average pooling layer. The input to each projection head is the output of the backbone network, a feature map of size 512 × 4 × 4, and the output dimension is the number of style dimensions; for the pretrained FFHQ StyleGAN2 used in our experiments, this value is 9088. For the generator, we use an FFHQ-pretrained StyleGAN2, obtained from the official implementation, that outputs faces at a resolution of 1024 × 1024. No discriminator is used during training.
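The projection head described above (convolution, Leaky ReLU, global average pooling) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the excerpt does not state the kernel size, so a 1 × 1 convolution is assumed, the weights are random, and the toy sizes stand in for the 512 × 4 × 4 feature map and the 9088 style dimensions. The names `projection_head` and `make_head` are hypothetical.

```python
import random


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, small linear slope otherwise."""
    return x if x >= 0.0 else negative_slope * x


def projection_head(feature_map, weights, bias):
    """Apply an (assumed) 1x1 convolution, Leaky ReLU, then global average pooling.

    feature_map: nested list indexed [channel][row][col]
    weights:     nested list indexed [out_dim][channel]
    bias:        list of length out_dim
    Returns a list of length out_dim (one value per style dimension).
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    outputs = []
    for w_row, b in zip(weights, bias):
        total = 0.0
        for i in range(height):
            for j in range(width):
                # 1x1 convolution: weighted sum over channels at one position.
                pre = b + sum(w_row[c] * feature_map[c][i][j] for c in range(channels))
                total += leaky_relu(pre)
        # Global average pooling over all spatial positions.
        outputs.append(total / (height * width))
    return outputs


def make_head(in_channels, out_dim, rng):
    """Randomly initialised head parameters (illustrative only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_channels)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


rng = random.Random(0)
C, H, W, D = 8, 4, 4, 6  # toy sizes; the text uses C=512, H=W=4, D=9088
feature_map = [[[rng.gauss(0.0, 1.0) for _ in range(W)] for _ in range(H)] for _ in range(C)]

# One head per prediction type named in the text.
heads = {name: make_head(C, D, rng) for name in ("pointwise", "lower", "upper")}
predictions = {name: projection_head(feature_map, w, b) for name, (w, b) in heads.items()}
```

With untrained weights nothing forces the lower prediction to sit below the upper one; any such ordering would have to come from training, which the excerpt does not describe.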

artificial intelligence, dimension, experiment, (17 more...)

Genre: Research Report (0.35)

Technology: Information Technology > Artificial Intelligence (0.70)

290141d6bfd7ea4d3f4483d126609bf6-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 00:48:54 GMT

The input to each projection head is the output of the backbone network, a feature vector of size 512, and the output dimension is the number of style dimensions; for the pretrained CLEVR StyleGAN2 used in our experiments, this value is 204. For the generator, we use a modified version of StyleGAN2 that is trained to output images of resolution 128 × 128. For each input image of size H × W × C, we start by generating a random mask of size H × W where each pixel value is contained in the interval [0, 1]. These thresholds were obtained by visual inspection. In this experiment, we set out to measure the variability of the predicted quantile intervals as a function of problem difficulty.
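The mask-generation step can be illustrated with a short sketch. The excerpt does not say which distribution the mask is drawn from or how the mask and the thresholds are used afterwards, so uniform sampling is an assumption here and the helper name `random_mask` is hypothetical; only the shape (H × W) and the value range come from the text.

```python
import random


def random_mask(height, width, rng):
    """Return an H x W mask of floats in [0, 1].

    Uniform sampling is an assumption; the source only states the value range.
    """
    return [[rng.random() for _ in range(width)] for _ in range(height)]


rng = random.Random(0)
H, W = 128, 128  # matches the 128 x 128 generator resolution in the text
mask = random_mask(H, W, rng)
```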

artificial intelligence, dimension, machine learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.56)

Reinforcement Learning with Dynamic Multi-Reward Weighting for Multi-Style Controllable Generation

de Langis, Karin, Koo, Ryan, Kang, Dongyeop

arXiv.org Artificial Intelligence | Feb-21-2024

Style is an integral component of text that expresses a diverse set of information, including interpersonal dynamics (e.g., formality) and the author's emotions or attitudes (e.g., disgust). Humans often employ multiple styles simultaneously. An open question is how large language models can be explicitly controlled so that they weave together target styles when generating text: for example, to produce text that is both negative and non-toxic. Previous work investigates the controlled generation of a single style, or else controlled generation of a style and other attributes. In this paper, we expand this into controlling multiple styles simultaneously. Specifically, we investigate various formulations of multiple style rewards for a reinforcement learning (RL) approach to controlled multi-style generation. These reward formulations include calibrated outputs from discriminators and dynamic weighting by discriminator gradient magnitudes. We find that dynamic weighting generally outperforms static weighting approaches, and we explore its effectiveness in 2- and 3-style control, even compared to strong baselines like plug-and-play models. All code and data for RL pipelines with multiple style attributes will be publicly available.
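The abstract contrasts static and dynamic weighting of several style rewards but does not give the formulas. As a rough sketch under stated assumptions, the code below treats static weighting as a plain average and dynamic weighting as weights proportional to each discriminator's gradient magnitude, normalised to sum to one. Both formulations, and the names `static_reward`, `dynamic_weights` and `dynamic_reward`, are hypothetical illustrations rather than the paper's method.

```python
def static_reward(rewards):
    """Static weighting (assumed): every style reward gets the same weight."""
    return sum(rewards) / len(rewards)


def dynamic_weights(grad_magnitudes, eps=1e-8):
    """Weights proportional to gradient magnitudes (assumed normalisation)."""
    total = sum(grad_magnitudes) + eps  # eps avoids division by zero
    return [g / total for g in grad_magnitudes]


def dynamic_reward(rewards, grad_magnitudes):
    """Combine per-style rewards using the dynamic weights above."""
    weights = dynamic_weights(grad_magnitudes)
    return sum(w * r for w, r in zip(weights, rewards))


# Toy example: two style discriminators (say, sentiment and toxicity).
rewards = [1.0, 0.0]          # hypothetical per-style reward values
grad_magnitudes = [3.0, 1.0]  # hypothetical discriminator gradient norms
combined = dynamic_reward(rewards, grad_magnitudes)
```

In this toy case the first style receives three quarters of the total weight because its gradient magnitude is three times larger, whereas static weighting would split the weight evenly.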

discriminator, language model, style combination, (16 more...)

2402.14146

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Iowa (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency

Whitehill, Matt, Ma, Shuang, McDuff, Daniel, Song, Yale

arXiv.org Machine Learning | Oct-25-2019

Current multi-reference style transfer models for Text-to-Speech (TTS) perform sub-optimally on disjoint datasets, where one dataset contains only a single style class for one of the style dimensions. These models generally fail to produce style transfer for the dimension that is underrepresented in the dataset. In this paper, we propose an adversarial cycle consistency training scheme with paired and unpaired triplets to ensure the use of information from all style dimensions. During training, we incorporate unpaired triplets with randomly selected reference audio samples and encourage the synthesized speech to preserve the appropriate styles using adversarial cycle consistency. We use this method to transfer emotion from a dataset containing four emotions to a dataset with only a single emotion. This results in a 78% improvement in style transfer (based on emotion classification) with minimal reduction in fidelity and naturalness. In subjective evaluations, our method was consistently rated as closer to the reference style than the baseline. Synthesized speech samples are available at: https://sites.google.com/view/adv-cycle-consistent-tts

dataset, emotion, style dimension, (15 more...)

1910.11958

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)

Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks

Xiang, Sitao, Li, Hao

arXiv.org Machine Learning | May-21-2018

Deep learning-based style transfer between images has recently become a popular area of research. A common way of encoding "style" is through a feature representation based on the Gram matrix of features extracted by some pre-trained neural network or some other form of feature statistics. Such a definition is based on an arbitrary human decision and may not best capture what a style really is. In trying to gain a better understanding of "style", we propose a metric learning-based method to explicitly encode the style of an artwork. In particular, our definition of style captures the differences between artists, as shown by classification performance, and the style representation can be interpreted, manipulated and visualized through style-conditioned image generation with a Generative Adversarial Network. We employ this method to explore the style space of anime portrait illustrations.
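For context, the Gram-matrix style encoding that the abstract describes as the common baseline (not the metric-learning method it proposes) can be sketched as follows. The features are assumed to be flattened to one list of spatial activations per channel, and dividing by the number of positions is one common normalisation choice rather than something the abstract specifies. The function name `gram_matrix` is illustrative.

```python
def gram_matrix(features, normalize=True):
    """Gram matrix of channel features.

    features: list of C channels, each a list of N activations
              (a feature map flattened over its spatial positions).
    Returns a C x C matrix whose (i, j) entry is the inner product of
    channel i and channel j, optionally divided by N.
    """
    n_positions = len(features[0])
    scale = 1.0 / n_positions if normalize else 1.0
    return [
        [scale * sum(a * b for a, b in zip(f_i, f_j)) for f_j in features]
        for f_i in features
    ]


# Toy feature map: 2 channels, 2 spatial positions.
features = [[1.0, 2.0], [3.0, 4.0]]
gram = gram_matrix(features, normalize=False)
```

The matrix is symmetric and discards where in the image each activation occurred, which is why it is used as a position-independent summary of texture-like statistics.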

artificial intelligence, dimension, machine learning, (17 more...)

1805.07997

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
